Preparations

Load the necessary libraries

library(car)       #for regression diagnostics
library(broom)     #for tidy output
library(ggfortify) #for model diagnostics
library(sjPlot)    #for outputs
library(knitr)     #for kable
library(effects)   #for partial effects plots
library(emmeans)   #for estimating marginal means
library(ggeffects) #for plotting marginal means
library(MASS)      #for glm.nb
library(MuMIn)     #for AICc
library(tidyverse) #for data wrangling
library(modelr)    #for auxillary modelling functions
library(performance) #for residuals diagnostics
library(see)         #for plotting residuals
library(DHARMa)    #for residual diagnostics plots
library(patchwork) #grid of plots
library(scales)    #for more scales

Scenario

Here is a modified example from Quinn and Keough (2002). Day and Quinn (1989) described an experiment that examined how rock surface type affected the recruitment of barnacles to a rocky shore. The experiment had a single factor, surface type, with 4 treatments or levels: algal species 1 (ALG1), algal species 2 (ALG2), naturally bare surfaces (NB) and artificially scraped bare surfaces (S). There were 5 replicate plots for each surface type and the response (dependent) variable was the number of newly recruited barnacles on each plot after 4 weeks.

Six-plated barnacle

Format of day.csv data files

TREAT BARNACLE
ALG1 27
.. ..
ALG2 24
.. ..
NB 9
.. ..
S 12
.. ..
TREAT Categorical listing of surface types. ALG1 = algal species 1, ALG2 = algal species 2, NB = naturally bare surface, S = scraped bare surface.
BARNACLE The number of newly recruited barnacles on each plot after 4 weeks.

Read in the data

As we are going to treat Treatment as a categorical predictor, we will specifically declare it as such straight after importing the data.

day = read_csv('../public/data/day.csv', trim_ws=TRUE)
glimpse(day)
## Rows: 20
## Columns: 2
## $ TREAT    <chr> "ALG1", "ALG1", "ALG1", "ALG1", "ALG1", "ALG2", "ALG2", "ALG2…
## $ BARNACLE <dbl> 27, 19, 18, 23, 25, 24, 33, 27, 26, 32, 9, 13, 17, 14, 22, 12…
day <- day %>%
  mutate(TREAT = factor(TREAT))

Exploratory data analysis

Model formula: \[ y_i \sim{} \mathcal{Pois}(\lambda_i)\\ \mu_i = \boldsymbol{\beta} \bf{X_i} \]

where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the intercept and treatment contrasts for the effects of Treatment on barnacle recruitment.

ggplot(day, aes(y=BARNACLE, x=TREAT)) +
    geom_boxplot() + 
    geom_point(color='red')

ggplot(day, aes(y=BARNACLE, x=TREAT)) +
    geom_violin() +
    geom_point(color='red')

Fit the model

Model validation

Partial plots

Model investigation / hypothesis testing

Predictions

Post-hoc test (Tukey’s)

Planned contrasts

Define your own

Compare:

  1. ALG1 vs ALG2
  2. NB vs S
  3. average of ALG1+ALG2 vs NB+S

Summary figures

References

Quinn, G. P., and K. J. Keough. 2002. Experimental Design and Data Analysis for Biologists. London: Cambridge University Press.